-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(udf): store WASM UDF in meta store #15269
Conversation
Signed-off-by: Runji Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Cache::builder() | ||
.time_to_idle(Duration::from_secs(60)) | ||
.build() | ||
}); | ||
|
||
if let Some(runtime) = RUNTIMES.get(link).await { | ||
let md5 = md5::compute(binary); | ||
if let Some(runtime) = RUNTIMES.get(&md5) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about using get_with
here? BTW, is it possible that there's an md5
collision? 🤣
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know get_with
before you mentioned it. Looks good to use.
I think the possibility of md5 collision is negligible as this is not a security critical case. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have the udf's identifier? Why bother to use md5? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are different. The identifier is used for finding functions within a WASM binary, while multiple functions may share the same binary. Here we cache the WASM runtime for each binary, but we don't want to store full binaries in memory, so their md5 is used as the cache key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. But my idea is that the case where multiple functions share the same binary is also rare. Anyway, both OK to me.
@@ -44,6 +44,7 @@ pub struct Model { | |||
pub link: Option<String>, | |||
pub identifier: Option<String>, | |||
pub body: Option<String>, | |||
pub compressed_binary: Option<Vec<u8>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have an experience value that how large it can be for a WASM binary? I'm concerned about the possibility of running out of memory as all catalogs are cached in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is about 300KB after compression for our wasm binary in e2e test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Others generally LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM.
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
9e95085
to
7679177
Compare
Signed-off-by: Runji Wang <[email protected]>
What will happen if the user uploads a 100GB wasm file? |
OOM 😅. Should set a limit. |
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
resolve #14568.
This PR moves WASM UDF binaries from object store to meta store. It removes system parameter
wasm_storage_url
and adds a fieldcompressed_binary
to the function catalog. WASM binaries will be compressed by zstd and then stored in the catalog.Existing two ways to create WASM functions are still supported:
But the behavior of the frontend handling it changed.
Given that Rust & WASM UDF are introduced in v1.7, if this PR can catch up with this version release, it can be considered a non-breaking change.
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.